1. INTRODUCTION

COVID-19, the disease caused by the SARS-CoV-2 virus, has been a calamity for humanity,
particularly for the health, economic, and education sectors [1]. For instance, a recent infection
surge in India prompted many families to seek care at home because of a scarcity of intensive care
units [2]. COVID-19 is a severe disease that kills a significant number of people each day, and it
affects not just one country but the entire world [3]. Rapid and reliable detection of infected
patients is therefore critical in the fight against COVID-19. The pandemic also affects the
education sector: students who have to learn from home have difficulty obtaining an education and
make little or no progress [4], [5]. As a result, the pandemic has a significant impact on almost
every aspect of life [6].

Several methods have been used to identify COVID-19 patients, including swab tests, rapid tests,
and antigen tests. Chest X-ray (CXR) and computed tomography (CT) are frequently used screening
techniques that aid in the diagnosis of COVID-19 patients by comparing their lungs with healthy
lungs, particularly when viral tests are unavailable [7]. Moreover, screening with a chest X-ray is
quicker and less expensive [8].

There is much previous research on the detection of COVID-19, such as that conducted by Umair.
Using transfer learning, Umair created deep learning models capable of diagnosing COVID-19
patients from X-ray data, achieving an F1-score of 98.36% with ResNet-34 and 97.56% with ResNet50
using only 406 X-ray images [9]. In another study, Ismael [10] used 380 X-ray images and achieved
an F1-score of 95.92% with ResNet50 features and an SVM classifier. Imaduddin [11], using
ResNet50, obtained 92.3% accuracy, 93% F1-score, 93% precision, 90.7% specificity, and 99%
sensitivity on X-ray datasets. The results of the present study are compared with this earlier
research.

The goal of this study is to develop a COVID-19 detection model using the transfer learning
approach on multiple residual network architectures, so that training starts from model weights
learned on larger datasets. Once training is complete, the model can be used to help diagnose
COVID-19, which is beneficial for expediting screening in response to the current pandemic
emergency. We conducted ten experiments using ten different pre-trained models based on residual
networks and the PyTorch framework to obtain even better accuracy scores. The paper is structured
as follows. Section 2 gives a brief summary of the dataset, data preprocessing, convolutional
neural network (CNN) classifiers, and the different pre-trained models. Section 3 presents the
experimental results and discussion. Finally, section 4 draws a few conclusions.


2. METHOD 

In this study, the transfer learning method was used to classify X-ray image data in order to
identify COVID-19. Chest X-rays were used to conduct experiments with ten different residual
network architectures. The pre-trained models are ResNet50, RexNet100, SSL ResNet50, semi-weakly
supervised learning (SWSL) ResNet50, Wide ResNet50, SK ResNet34, ECA ResNet50d, Inception ResNet
V2, CSP ResNet50, and ResNest50d. The methodology comprises data preparation, feature extraction,
classification, and model evaluation, for which a confusion matrix is used. Accuracy, specificity,
sensitivity, F1-score, and precision are the metrics used to assess model effectiveness. Figure 1
depicts the methodology employed in this research.


Figure 1. Research methodology (stages shown include transfer learning and evaluation)


2.1. Dataset

The research was carried out using a dataset of X-ray images gathered from a variety of sources
and freely available on the Kaggle website [12], [13]. We used only two classes from the dataset,
namely the normal class and the COVID class. These two classes contain 13,808 images in total,
comprising 10,192 normal images and 3,616 COVID images. Detailed information about the dataset is
shown in Table 1. An example of a COVID-19 X-ray image is shown in Figure 2(a), and an example of
a normal X-ray image is shown in Figure 2(b).


Table 1. Lung X-Ray dataset 


No   Class    Number of images
1    Normal   10,192
2    COVID    3,616
     Total    13,808
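As a rough illustration of the class imbalance in Table 1, the sketch below computes each class's
share of the dataset and the accuracy that a trivial classifier would reach by always predicting
the majority class. The variable names are illustrative, and this baseline is an assumption added
for context; it is not one of the experiments reported in the paper.

```python
# Image counts per class, taken from Table 1.
counts = {"normal": 10192, "covid": 3616}

total = sum(counts.values())
shares = {name: n / total for name, n in counts.items()}

# Accuracy of a hypothetical classifier that always predicts the majority class.
majority_baseline = max(shares.values())
imbalance_ratio = counts["normal"] / counts["covid"]

print(f"total images: {total}")
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
print(f"majority-class baseline accuracy: {majority_baseline:.1%}")
print(f"normal-to-COVID ratio: {imbalance_ratio:.2f}")
```

Because roughly three quarters of the images are normal, accuracy alone can look high even for a
weak model, which is one reason to report sensitivity, specificity, precision, and F1-score as
well.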


Figure 2. X-ray of the lungs: (a) COVID and (b) normal


2.2. Preprocessing 

Preprocessing transforms the data before it is used to train the model. Preprocessing can improve
the image quality of the dataset [14], which helps the dataset provide more information and
strengthens the classification model. The dataset is divided into two classes, normal and COVID,
with a total of 13,808 images consisting of 10,192 normal images and 3,616 COVID images. In this
study, we resize each X-ray image to 224 x 224 pixels and normalize it with the ImageNet
statistics, mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].
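The normalization step can be sketched for a single pixel as follows. This is a minimal
illustration, not the authors' pipeline: it assumes that 8-bit pixel values are first scaled to
the range [0, 1] (the paper does not state this) and that the grayscale X-ray is replicated
across three channels. The function name `normalize_pixel` is hypothetical.

```python
# ImageNet channel statistics quoted in the text (RGB order).
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    # Assumption: inputs are 8-bit values, so dividing by 255 maps them to [0, 1].
    scaled = [value / 255.0 for value in rgb]
    return [(s - m) / sd for s, m, sd in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]


# A mid-gray pixel; all three channels are equal for a grayscale image.
print(normalize_pixel((128, 128, 128)))
```

After this step each channel is expressed in units of the ImageNet standard deviation, which
matches the input distribution the pre-trained weights were trained on.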


2.3. Transfer learning 

Transfer learning is a machine learning technique in which the knowledge gained by a CNN from one
set of data is transferred to another set of data in order to accomplish a separate but related
task that would otherwise require additional data [15]. Data is one of the most important
components of a deep learning approach, and the lack of medical data is one of the most
significant challenges for researchers in medical fields, although the availability of medical
datasets is improving. Transfer learning has the advantage of not requiring large datasets: a
model that has previously been trained on a large dataset is reused as the starting point for a
new model that is trained on a smaller dataset, which makes training more accurate and less
computationally expensive. Through this technique, CNN training on a small dataset for a specific
task is initialized with the weights of a model pre-trained on a large-scale dataset [16]. In this
study, we adapt a pre-trained model by using it as a feature extractor to train a new classifier,
after which we fine-tune the pre-trained model on the new dataset with a small learning rate.
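The two phases described above (training a new classifier on frozen features, then fine-tuning
with a small learning rate) can be sketched in PyTorch as follows. This is a minimal sketch under
stated assumptions, not the authors' code: the tiny `backbone` is a stand-in for an
ImageNet-pre-trained residual network, the batch is random data, and the helper names
`set_trainable` and `train_step` are hypothetical. The learning rates, loss, optimizer, output
activation, and seed follow the values reported in the training section of this paper.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Stand-in for a pre-trained backbone (hypothetical; the study uses ImageNet-trained ResNets).
backbone = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
)
# New classification head with two outputs (normal, COVID).
head = nn.Sequential(nn.Linear(8, 2), nn.LogSoftmax(dim=1))
model = nn.Sequential(backbone, head)
criterion = nn.NLLLoss()


def set_trainable(module, trainable):
    """Enable or disable gradient updates for every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = trainable


def train_step(optimizer, images, labels):
    """Run one optimization step and return the batch loss."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


# Random batch standing in for preprocessed 224x224 X-ray images.
images = torch.randn(4, 3, 224, 224)
labels = torch.tensor([0, 1, 0, 1])

# Phase 1 (adaptation): freeze the backbone and train only the new head.
set_trainable(backbone, False)
adapt_optimizer = torch.optim.AdamW(head.parameters(), lr=1e-3)
adapt_loss = train_step(adapt_optimizer, images, labels)

# Phase 2 (fine-tuning): unfreeze everything and use a much smaller learning rate.
set_trainable(backbone, True)
finetune_optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
finetune_loss = train_step(finetune_optimizer, images, labels)

print(f"adaptation loss: {adapt_loss:.4f}, fine-tuning loss: {finetune_loss:.4f}")
```

Freezing the backbone first lets the randomly initialized head settle before the pre-trained
weights are updated; the small fine-tuning learning rate then limits how far those weights drift
from their pre-trained values.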


2.4. ResNets

Deeper neural networks are more difficult to train. Residual learning was developed to ease the
training of networks that are significantly deeper than those previously employed. Instead of
learning unreferenced functions, the layers are explicitly reformulated as learning residual
functions with reference to the layer inputs. There is evidence that such residual networks are
easier to optimize and that they can gain accuracy from considerably increased depth [17].
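In the usual notation for residual learning, a building block can be written as follows, where x
and y are the input and output of the block and F is the residual function learned by the stacked
layers with weights W_i. The symbols are introduced here for illustration and do not appear
elsewhere in this paper.

```latex
\[
  y = \mathcal{F}(x, \{W_i\}) + x
\]
```

If the desired mapping is close to the identity, the layers only need to push F toward zero,
which is easier than fitting an identity mapping with a stack of nonlinear layers.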


2.5. ResNext 

ResNext is a simple, highly modularized network architecture for image classification. The
network is constructed by repeating a building block that aggregates a set of transformations
with the same topology. This straightforward design results in a homogeneous, multi-branch
architecture with only a few hyperparameters to set. In addition to the depth and width
dimensions, this strategy exposes a new dimension, called cardinality (the size of the set of
transformations), as an essential factor. The architecture demonstrates that increasing
cardinality can improve classification accuracy even when complexity is held constant.
Furthermore, when capacity is increased, increasing cardinality is more effective than going
deeper or wider [18].
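The aggregation of transformations can be summarized in one expression, where x and y are the
block input and output, C is the cardinality, and each T_i is a transformation with the same
topology. The symbols are introduced here for illustration.

```latex
\[
  y = x + \sum_{i=1}^{C} \mathcal{T}_i(x)
\]
```

With C = 1 the expression reduces to an ordinary residual block, so cardinality can be varied
independently of depth and width.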


2.6. Wide ResNets 

Deep residual networks have been shown to scale up to thousands of layers while still improving
performance. However, training very deep residual networks suffers from diminishing feature
reuse, which makes these networks very slow to train. Wide residual networks address this problem
by decreasing the depth and increasing the width of the residual network [19].


2.7. RexNet 

Rank expansion networks (RexNet) address the representational bottleneck in image classification
models by expanding the input channel size of the convolution layers, replacing the ReLU6
activation function only after the first 1x1 convolution in each inverted bottleneck, and using
other nonlinear functions such as ELU, which are considered to improve accuracy [20].


2.8. ResNet SSL & ResNet SWSL 

These models keep the ResNet architecture, in which the layers learn residual functions with
reference to the layer inputs instead of learning unreferenced functions. Rather than expecting
every few stacked layers to fit the desired underlying mapping directly, the residual network
lets these layers fit a residual mapping. Residual blocks are stacked on top of each other to
form the network. The SSL and SWSL models use semi-supervised and semi-weakly supervised
learning, respectively, to improve model performance. This approach brings important gains to
standard architectures for image, video, and fine-grained classification [21].


2.9. ResNet CSP 

ResNet CSP is a convolutional neural network that applies the cross-stage partial network
(CSPNet) technique to ResNet. CSPNet divides the feature map of the base layer into two parts,
which are subsequently merged through a cross-stage hierarchy. This split-and-merge strategy
allows more gradient flow through the network. The network additionally contributes to gradient
variability by incorporating feature maps from early and late network stages. It is reported to
reduce computation by 20% while maintaining or even improving accuracy [22].


2.10. SK ResNet 

SK ResNet is a ResNet variant that uses selective kernel units in place of the traditional ResNet
unit. All large-kernel convolutions in the original ResNet bottleneck blocks are replaced by
selective kernel convolutions, which allow the network to choose an appropriate receptive field
size adaptively [23].


2.11. ECA ResNet 

ECA ResNet is a ResNet variant that adds the efficient channel attention (ECA) module to the
original ResNet. The ECA module is an architectural unit that builds on squeeze-and-excitation
blocks but avoids dimensionality reduction, which keeps model complexity low: it adds only a
handful of parameters while giving a significant gain in performance [24].
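The idea of channel attention without dimensionality reduction can be sketched in plain Python.
This is an illustrative sketch rather than the reference implementation: it assumes the common
formulation of global average pooling, a shared 1D convolution across neighboring channels with
zero padding and an odd kernel size, and a sigmoid gate. The kernel values are made up, and the
function names `eca_weights` and `eca` are hypothetical.

```python
import math


def eca_weights(channel_means, kernel):
    """Channel attention weights: 1D convolution over channel means, then sigmoid."""
    pad = len(kernel) // 2  # assumes an odd kernel size
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        window = padded[c:c + len(kernel)]
        response = sum(w * v for w, v in zip(kernel, window))
        weights.append(1.0 / (1.0 + math.exp(-response)))
    return weights


def eca(feature_map, kernel):
    """Apply ECA-style attention to a feature map given as channels x pixels."""
    # Squeeze: global average pooling gives one descriptor per channel.
    means = [sum(channel) / len(channel) for channel in feature_map]
    weights = eca_weights(means, kernel)
    # Rescale every channel by its attention weight.
    return [[w * v for v in channel] for w, channel in zip(weights, feature_map)]


# Four channels with two pixels each; the kernel values are made up for illustration.
features = [[1.0, 3.0], [0.0, 2.0], [4.0, 4.0], [2.0, 0.0]]
kernel = [0.2, 0.5, 0.2]
attended = eca(features, kernel)
print(attended)
```

The number of learnable parameters equals the kernel size (three in this sketch) regardless of
how many channels the feature map has, which illustrates why the module is described as adding
only a handful of parameters.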


2.12. ResNest

ResNest is a ResNet variant built by stacking Split-Attention blocks. Within a block, the
cardinal group representations are concatenated along the channel dimension to form the final
representation. As in a standard residual block, a shortcut connection is used to produce the
final output of the Split-Attention block when the input and output feature maps have the same
shape. For blocks with a stride, an appropriate transformation is applied to the shortcut
connection to align the output shapes. Because the architecture is simple and uniform, it can be
parameterized with only a few variables, and it is reported to surpass EfficientNet in the
accuracy versus latency trade-off for image classification [25].


2.13. Inception ResNet V2 

Inception ResNet V2 is a convolutional neural network that builds on the Inception architecture
family but incorporates residual connections, which replace the filter concatenation stage of the
Inception architecture. Each Inception block is followed by a filter-expansion layer (a 1 x 1
convolution without activation) that scales up the dimensionality of the filter bank before the
addition so that it matches the depth of the block input. This is needed to compensate for the
dimensionality reduction induced by the Inception block. Training with residual connections
significantly accelerates the training of Inception networks [26].


2.14. Training 

This research was carried out using the Python programming language, PyTorch as the framework,
and several libraries for running deep learning models. We also used Google's cloud computing
service, which provides a GPU and thereby reduces the time required to train a model. A random
seed of 42 was used so that the experiments in this study can be replicated with the same
results. We use the LogSoftmax activation function on the output layer, the negative
log-likelihood loss function, and the AdamW optimizer. An EarlyStopping callback terminates the
training process if the accuracy has not improved within a specified number of epochs (the
patience), on the assumption that training has converged. The adaptation phase uses patience = 2
and a learning rate of 0.001, and the fine-tuning phase uses patience = 5 and a learning rate of
1e-5.

During preprocessing, the images are resized to 224x224 and then normalized with the ImageNet
statistics for use in the training, validation, and test phases. For feature extraction, we use
models that have been pre-trained on large amounts of data from ImageNet. The pre-trained models
are ResNet50, RexNet100, SSL ResNet50, SWSL ResNet50, Wide ResNet50, SK ResNet34, ECA ResNet50d,
Inception ResNet V2, CSP ResNet50, and ResNest50d. Using pre-trained models simplifies the
research because the feature extractor does not have to be trained from scratch; this technique
is referred to as transfer learning. The only modification needed is to replace the head of the
architecture to suit the dataset being used. Because the desired classification result has two
classes, two neurons are used at the output.
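The early stopping rule described above can be sketched in plain Python. The paper does not say
which EarlyStopping implementation was used, so the class below is a hypothetical stand-in that
assumes the monitored quantity is validation accuracy (higher is better); the accuracy values in
the usage example are made up.

```python
class EarlyStopping:
    """Stop training when a monitored score has not improved for `patience` epochs."""

    def __init__(self, patience):
        self.patience = patience
        self.best_score = None
        self.epochs_without_improvement = 0

    def step(self, score):
        """Record one epoch's validation score; return True when training should stop."""
        if self.best_score is None or score > self.best_score:
            self.best_score = score
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up validation accuracies for the adaptation phase (patience = 2).
stopper = EarlyStopping(patience=2)
history = [0.91, 0.94, 0.95, 0.95, 0.94, 0.96]
for epoch, accuracy in enumerate(history, start=1):
    if stopper.step(accuracy):
        print(f"stopping after epoch {epoch}, best accuracy {stopper.best_score}")
        break
```

A small patience suits the adaptation phase, where only the new head is trained and convergence
is fast; the larger patience in the fine-tuning phase tolerates the slower progress that a
learning rate of 1e-5 produces.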


2.15. Evaluation 

Evaluation tests the classification performance of a model. The evaluation in this research uses
a confusion matrix, which yields true positive (TP), false positive (FP), false negative (FN),
and true negative (TN) values [27]. To evaluate the implemented models, we use several metrics:
accuracy, sensitivity, specificity, precision, and F1-score. The formulas for accuracy,
sensitivity, specificity, precision, and F1-score are shown in (1), (2), (3), (4), and (5),
respectively.


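\[ \text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]

\[ \text{Sensitivity/Recall} = \frac{TP}{TP + FN} \tag{2} \]

\[ \text{Specificity} = \frac{TN}{TN + FP} \tag{3} \]

\[ \text{Precision} = \frac{TP}{TP + FP} \tag{4} \]

\[ \text{F1-Score} = \frac{2 \times (\text{Precision} \times \text{Recall})}{\text{Recall} + \text{Precision}} \tag{5} \]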
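The five metrics in (1) to (5) can be computed directly from the confusion-matrix counts, as in
the sketch below. The function name `classification_metrics` and the counts in the usage example
are hypothetical and are not taken from the experiments.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute the metrics of equations (1)-(5) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1_score = 2 * (precision * sensitivity) / (sensitivity + precision)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1_score": f1_score,
    }


# Hypothetical counts, not taken from the study.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The sketch assumes that every denominator is nonzero, which holds whenever both classes are
present in the test set and the model predicts the positive class at least once.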


3. RESULTS AND DISCUSSION 

The testing phase was carried out using Python as the programming language and PyTorch as the
framework, and the programs were run on Google Colaboratory because of its high performance. The
dataset consists of 13,808 images: 3,616 COVID images and 10,192 normal images. Although the
dataset is highly imbalanced, the transfer learning approach still produces good results on the
test set. Table 2 shows the complete classification results of the trained models.


Table 2. Classification results on the test set (%)


No   Model                 Accuracy   Precision   F1-Score   Sensitivity   Specificity
1    ResNet50              98.94      99.28       99.28      99.28         97.96
2    RexNet100             97.05      98.36       98.00      97.65         95.36
3    SSL ResNet50          99.13      99.28       99.41      99.54         97.96
4    SWSL ResNet50         99.28      99.41       99.51      99.61         98.33
5    Wide ResNet50         98.41      99.08       98.92      98.76         97.40
6    SK ResNet34           98.60      99.09       99.05      99.02         97.40
7    ECA ResNet50d         98.65      99.15       99.09      99.02         97.58
8    Inception ResNet V2   96.62      98.93       97.69      96.48         97.03
9    CSP ResNet50          97.92      98.13       98.60      99.09         94.61
10   ResNest50d            98.50      99.41       98.98      98.56         98.33


Table 2 shows the classification of lung X-ray images (as normal or COVID) on the test set. SWSL
ResNet50 achieved the highest F1-score, 99.51%, whereas Inception ResNet V2 had the lowest,
97.69%. SWSL ResNet50 also has the highest sensitivity, 99.61%, whereas Inception ResNet V2 has
the lowest, 96.48%. The SWSL ResNet50 and ResNest50d models share the highest precision and
specificity, 99.41% and 98.33%, respectively. It can therefore be inferred that SWSL ResNet50 is
the most effective model for lung X-ray image classification. In previous research, Musleh and
Maghari [28] obtained an accuracy of 89.7% using the CheXNet approach, compared with 99.28% in
this study. Wang et al. [29] obtained a sensitivity of 80% using the COVID-Net approach to detect
COVID-19 in X-ray images, compared with 99.61% in this study. In addition, Imaduddin et al. [11]
used ResNet50 and obtained 92.3% accuracy, 93% precision, 93% F1-score, 99% sensitivity, and
90.7% specificity, compared with 99.28% accuracy, 99.41% precision, 99.51% F1-score, 99.61%
sensitivity, and 98.33% specificity in this study.
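As a sanity check on Table 2, the F1-score can be recomputed from the reported precision and
sensitivity using the definition in (5). The sketch below does this for three rows of the table;
the helper name `f1_from` is hypothetical, and small differences are expected because the table
values are rounded to two decimals.

```python
def f1_from(precision, sensitivity):
    """Harmonic mean of precision and sensitivity (recall), both in percent."""
    return 2 * precision * sensitivity / (precision + sensitivity)


# (precision, sensitivity, reported F1-score) in percent, copied from Table 2.
reported = {
    "ResNet50": (99.28, 99.28, 99.28),
    "SWSL ResNet50": (99.41, 99.61, 99.51),
    "Inception ResNet V2": (98.93, 96.48, 97.69),
}

for model, (precision, sensitivity, f1_reported) in reported.items():
    f1 = f1_from(precision, sensitivity)
    print(f"{model}: recomputed {f1:.2f}, reported {f1_reported:.2f}")
```

For these rows the recomputed values agree with the F1-scores in Table 2 to two decimals.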


4. CONCLUSION 

Identifying COVID-19 is challenging because of the large amount of data involved and its highly
imbalanced distribution. This paper applies several transfer learning models based on the
residual network architecture to the classification of lung X-ray data. Even though the
pre-trained residual networks are based on ImageNet data, which contains little medical data, the
transfer learning method produces good results when applied to medical image data. With transfer
learning, we do not have to start from scratch; we only have to change the last layer of the
model architecture. Our models performed well in distinguishing the two classes of lung X-ray
data, with the SWSL ResNet50 model producing the best results: accuracy of 99.28%, precision of
99.41%, F1-score of 99.51%, sensitivity of 99.61%, and specificity of 98.33%.